(57) Abstract

A multi-processor apparatus which includes an array of separately addressable memory units (18) and an array of
separately addressable processors (14). A first unidirectional bus (BUS I) delivers data from a selected
processor to a selected memory unit. A second unidirectional data bus (BUS II) delivers data from a selected memory unit to a selected
processor. Arbitor circuits (30, 32) control the flow of data to these data buses.
MULTI-PROCESSOR APPARATUS
Technical Field

The present invention relates to multi-processor
apparatus. A multi-processor apparatus
includes a plurality of processors for processing
digital data and is especially suitable for use in
processing digital image signals. Two or more
processors can operate on data at the same time,
thereby increasing data throughput.
Background Art

Where large amounts of digital data need to
be processed, a multi-processor apparatus is often
suitable. One particular application where
multi-processor apparatus is used is digital image
processing. Digital image processing is used to
perform image enhancement on a digital
image to produce an enhanced digital image. This
enhanced digital image is read out from memory and
provided to a high speed "scan" printer. Large
amounts of data must be processed. In order to
increase the throughput rate, multi-processor
apparatus having an array of processors is used.
These processors are often microcomputers. Because
of the amount of data involved and the need for
increased throughput, the use of multi-processor
apparatus is becoming more frequent.

A typical prior art multi-processor
apparatus architecture is shown in Fig. 1. It
operates by sharing a single bus between processors
and a single large main memory (data memory). Each
processor makes a request to gain control of the bus
when it needs access to a location in the main
memory. During each data transaction all other
processors which are not busy processing data must
wait for the bus to again become free. An arbitor
circuit (not shown) establishes the order in which
the processors can gain access to the bus.
Throughput (data transfer rate) increases as the
number of processors is increased. This increase in
throughput continues only up to a point. Thereafter,
an increase in the number of processors actually
decreases the throughput.
Disclosure of the Invention

The object of this invention is to provide a
parallel processor apparatus with increased
throughput.

This object is achieved by apparatus for
processing digital image signals, characterized by:
an array of separately addressable memory units, each
including input and output data storage means; an
array of processors each including input and output
data storage means; first data transfer means
including a first data bus, and means for
transferring data from output data storage means of a
selected processor via the first data bus to the
input data storage means of a selected memory unit;
and second data transfer means including a second
data bus, and means independent of said first
transferring means for transferring data from output
data storage means of a selected memory unit via the
second data bus to an input data storage means of a
selected processing unit.

The use of two separate data buses, the
first for delivering data from the processors to the
memory units and the second for delivering data from
the memory units to the processors, increases
throughput. Each data bus is used only for the short
duration required to transfer data between data
storage means.

The operation of each memory bus is
controlled by a separate arbitor circuit. Each
arbitor circuit is independent of the other circuit.
Thus, with this arrangement, data can be transferred
from a processor to a memory unit while at the same
time data are being transferred from a memory unit to
a processor.
Brief Description of the Drawings

Fig. 1 is a block diagram of a conventional
prior art digital image processing apparatus;

Fig. 2 shows in block form the elements of
digital image processing apparatus in accordance with
the invention;

Fig. 3 is a block diagram of portions of the
apparatus of Fig. 2 which illustrate the transfer of
data from an output buffer of a selected processor to
an input buffer of a selected memory unit; and

Fig. 4 is a schematic diagram of the arbitor
circuit for BUS I.

Modes of Carrying Out the Invention

Turning now to Fig. 2, a
multi-processor apparatus 10 in accordance with the
invention is shown. The apparatus is particularly
suitable for processing digital images and will be
described in connection with such processing. The
apparatus 10 includes an array 12 having a plurality
(n) of processors 14 and an array 16 having a
plurality (m) of memory units 18. At this point it
will be noted that the number n does not necessarily
equal the number m. All processors 14 must be able
to access any one of the memory units 18. Associated
with each processor 14 is an input latch 20 and an
output buffer 22. Each memory unit 18 includes an
input latch 26 and an output buffer 28. Buffers and
latches are data storage devices. A buffer is a
device that transmits the signal at its input to its
output. A latch is a device that stores the signal
at its input in response to a clock signal. These
buffers and latches include tri-state logic devices.
Tri-state logic devices or gates are commonly used in
the interconnection to a common bus. When a control
line is enabled, the tri-state devices are coupled to
the bus. When the control line is disabled, the
tri-state devices present a high output impedance and
are decoupled from the bus. BUS I is a
unidirectional bus and is associated with the output
buffers 22 of the processors and the input latches 26
of the memory units. An arbitor circuit 30 controls
the transfer of data on BUS I from processors to
memory units and an arbitor circuit 32 controls the
transfer of data from memory units 18 to the
processors 14. When a particular processor is ready
to process data, it raises the level of a signal on a
lead labeled "Grant Request." A high-level grant
request signal is provided to arbitor circuit 30.
The arbitor circuit 30, as will be described later,
is arranged so that each processor has almost equal
priority to gain access to BUS I. Arbitor circuit 30
arbitrates among all processors producing grant
request signals and in accordance with a
predetermined order sequentially transfers data
between the output buffer 22 of each selected
processor and the input latch 26 of the corresponding
memory unit.

The arbitor 32 functions independently of
the circuit 30 and controls the flow of data from the
output buffer 28 of a selected memory unit via BUS II
to the input latch 20 of a selected processor 14.
BUS II is a unidirectional bus and is associated with
the input latches 20 of the processors 14 and the
output buffers 28 of the memory units 18. The
arrangement of two unidirectional buses allows full
duplex operation, that is, at any given time a
processor can be transferring data to a memory unit
while at the same time, a memory unit can be
transferring data to another processor. Each bus is
used only for the short time required to transfer
data between data storage means. Each bus is
operated independently of the other bus. By means of
this arrangement, throughput can be significantly
increased.
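
By way of illustration only, the full duplex operation described above can be sketched in Python. The class and method names (`Port`, `Bus`, `transfer`) and the notion of a numbered cycle are assumptions introduced for this sketch and do not appear in the specification. The sketch shows only that a transfer on BUS I and a transfer on BUS II can occur in the same cycle, because neither bus is shared between the two directions.

```python
from dataclasses import dataclass


@dataclass
class Port:
    """A processor or memory unit with an input latch and an output buffer."""
    name: str
    input_latch: object = None    # last word latched from a bus
    output_buffer: object = None  # word waiting to be driven onto a bus


class Bus:
    """A unidirectional bus carrying at most one transfer per cycle (assumed)."""

    def __init__(self, name):
        self.name = name
        self.last_cycle_used = None

    def transfer(self, cycle, source, dest):
        if self.last_cycle_used == cycle:
            raise RuntimeError(f"{self.name} already used in cycle {cycle}")
        self.last_cycle_used = cycle
        # The destination latch stores what the source buffer drives.
        dest.input_latch = source.output_buffer
        source.output_buffer = None


bus_1 = Bus("BUS I")    # processors -> memory units
bus_2 = Bus("BUS II")   # memory units -> processors

p1, p2 = Port("processor 1"), Port("processor 2")
m1, m2 = Port("memory unit 1"), Port("memory unit 2")

p1.output_buffer = "write request from processor 1"
m2.output_buffer = "pixel data for processor 2"

# Both transfers take place in cycle 0 because they use different buses.
bus_1.transfer(cycle=0, source=p1, dest=m1)
bus_2.transfer(cycle=0, source=m2, dest=p2)
```

A second transfer on the same bus in the same cycle raises an error in this model, which corresponds to the single shared bus of Fig. 1 forcing the other requesters to wait.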

Other elements of the digital image
processor apparatus 10 will now be briefly
discussed. Prior to image processing, a digital
image corresponding to a light image must be stored
in the memory planes 24 of the memory units 18. The
digital pixel value stored at each memory location in
a memory plane 24 represents brightness or a gray
scale level. For a color digital image, each digital
image pixel can have 24 bits: 8 bits gray scale for
red, 8 bits gray scale for green and 8 bits gray
scale for blue. One of the processors can be
dedicated to receive digital image data and deliver
them to memory plane locations. Image sensors (not
shown) operated by their own microcomputer produce
analog signals corresponding to a color component of
a light image. These image sensors can be, for
example, CCD image area sensors. A conventional
digitizer (analog/digital converter) digitizes these
analog signals and applies them to the dedicated
processor. This processor gains access to BUS I and
applies image pixel data and an address onto BUS I.
This address includes not only the particular memory
unit to be accessed but also the memory location in
the memory plane of such unit where the digital image
pixel data are to be stored. For an example of a
system for producing digital images and storing them
in memory locations of a memory plane, see commonly
assigned International Patent Application No.
PCT/US86/00399, claiming the priority of U.S.
Patent Application Serial No. 710,242, filed March
11, 1985 in the name of Milch.
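
The 24-bit color pixel and the two-part address described above can be illustrated with a short Python sketch. The bit ordering (red in the most significant byte) and the 16-bit width assumed for the memory location field are choices made only for this example; the specification does not state them, and the function names are hypothetical.

```python
LOCATION_BITS = 16  # assumed width of the memory-location field


def pack_pixel(red, green, blue):
    """Pack three 8-bit gray scale components into one 24-bit pixel value."""
    for value in (red, green, blue):
        if not 0 <= value <= 0xFF:
            raise ValueError("each color component must fit in 8 bits")
    return (red << 16) | (green << 8) | blue


def unpack_pixel(pixel):
    """Recover the (red, green, blue) components from a 24-bit pixel value."""
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF


def make_address(memory_unit, location):
    """Combine a memory unit number and a location within its memory plane."""
    if not 0 <= location < (1 << LOCATION_BITS):
        raise ValueError("location does not fit in the assumed field width")
    return (memory_unit << LOCATION_BITS) | location


def split_address(address):
    """Split an address into (memory unit number, memory location)."""
    return address >> LOCATION_BITS, address & ((1 << LOCATION_BITS) - 1)


pixel = pack_pixel(0x12, 0x34, 0x56)
address = make_address(memory_unit=3, location=0x00FF)
```

The upper part of the address plays the role of selecting one of the memory units 18, and the lower part selects the location in that unit's memory plane 24.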

The purpose of the array of digital image
processors 14 (other than the dedicated processor
just discussed) is to produce an enhanced digital
image. A printer 50 responds to this enhanced
digital image on a digital pixel by digital pixel
basis to produce an output print which is more
suitable for viewing than if image processing had not
taken place. Digital image processing is well known
and often is used in accordance with grain suppression
algorithms, edge enhancement algorithms and tone
scale algorithms. Examples of such digital image
processing algorithms are set forth in commonly
assigned U.S. Patent Nos. 4,399,461, 4,442,454, and
4,446,484. The printer 50 can be provided by a laser
printer. Image processing algorithms, as well as
other process control algorithms necessary to
control the processors, are provided in memories (not
shown) associated with each processor.

After all the digital image processing has
been completed, an enhanced digital pixel is
delivered to a particular one of the memory units
18. This memory unit causes enhanced digital pixels
to be sequentially delivered to printer 50.

Turning now to Fig. 3, there is shown an
output buffer 22 of a selected processor 14 and an
input latch 26 of a selected memory unit 18. We will
assume at this point that the processor 14 for this
output buffer has already produced a high-level grant
signal on a Grant Request lead and provided it as an
input to arbitor circuit 30. Also the selected
processor has provided an address as an input to its
output buffer 22. The selected processor 14 also
provides its own return address as an input to buffer
22 so that the selected memory unit 18 will know the
processor return address. This return address is
sometimes referred to as a "packet return address."
When the grant request is honored, a grant signal is
produced by the circuit 30. The grant signal is
provided to the output buffer 22 of the requesting
processor 14. Data are then applied on BUS I from
buffer 22 and delivered to all the input latches of
the memory units 18. The desired memory unit is
decoded by decode logic 33 from the address. If the
unit is not busy, the grant signal is gated to that
memory unit. In this way, these data are only
entered into the latch of the addressed or selected
memory unit.

The busy signal is produced by logic
associated with a memory unit and indicates that it
is unable to accept data. The arbitor circuit 30
will assume the grant request has been serviced and
continue to service all the other grant request
signals. The unserviced processor will continue to
produce a grant request signal. Thereafter circuit
30 will repeat the process discussed above and will
service this processor if the addressed memory unit
is not busy. The operation of circuit 30 will be
described in detail later with reference to Fig. 4.
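
The gating of the grant signal by decode logic 33 and the effect of the busy signal can be sketched as follows. This is a behavioral illustration under stated assumptions, not a description of the actual circuit: the names `Packet`, `MemoryUnit` and `decode_and_latch` are hypothetical, and the packet fields simply mirror the memory address, data and packet return address mentioned in the text.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    memory_unit: int     # which memory unit is addressed
    location: int        # memory location within that unit's memory plane
    data: int            # pixel data
    return_address: int  # "packet return address" of the sending processor


class MemoryUnit:
    def __init__(self):
        self.busy = False        # busy signal from logic in the memory unit
        self.input_latch = None  # input latch 26


def decode_and_latch(packet, memory_units):
    """Gate the grant to the addressed unit only.

    Returns True if the packet was latched. Returns False if the unit was
    busy, in which case the processor keeps its grant request raised and is
    serviced on a later arbitration pass.
    """
    unit = memory_units[packet.memory_unit]
    if unit.busy:
        return False
    unit.input_latch = packet
    return True


units = [MemoryUnit() for _ in range(3)]
units[1].busy = True
packet = Packet(memory_unit=1, location=0x20, data=0xABCDEF, return_address=4)

first_try = decode_and_latch(packet, units)   # unit 1 is busy: not latched
units[1].busy = False
second_try = decode_and_latch(packet, units)  # retried later: latched
```

Although the data on BUS I reach every memory unit, only the addressed unit's latch is clocked in this model, so the other units' latches are left unchanged.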

Decode logic circuitry is not needed for the
BUS II arbitor. The reason for this is that when a
processor requests data, it will remain idle until
data are delivered to it from a memory unit.

As shown in Fig. 3, a low level signal to
the output buffer of the selected processor is an
enabling signal to tri-state logic in such a buffer
causing data to be transferred to BUS I. The small
circle at the input of the buffer 22 indicates it
responds to a low level signal. The small triangle
or wedge in the latch 26 indicates that it is enabled
by a positive going edge signal. At this point we
will assume that the tri-state logic in the output
buffer has applied the memory address, data and
processor address onto BUS I. Thereafter, a rising
edge is applied by circuit 30 through the decoding
logic 33 to the selected input latch 26. All data on
BUS I are latched into such selected input latch 26.



Returning now to Fig. 2, we will for the
sake of explanation assume that the addressed memory
unit has been instructed to deliver data from an
addressed memory location in memory plane 24 to
output latch 26. After such data are stored in latch
26, logic associated with the memory plane produces a
high-level grant request signal and the output buffer
is loaded with data from the memory location and the
processor address. When the grant request to arbitor
circuit 32 is honored, these data and the processor
address are applied from the output buffer onto BUS
II. Since the desired processor is not busy but
waiting for data, the data are then delivered to the
input latch 20 of the processor indicated by the
packet return address via similar decoding logic as
described above. This processor, having received
data, then performs an appropriate operation in
accordance with a stored algorithm in a stored
program.

Turning now to Fig. 4, a schematic diagram
of arbitor circuit 30 is shown. There are provided
two banks of flip/flops, 78 and 79. The first bank
78 receives the grant request signals and the second
bank 79 produces the grant signals. All the
flip/flops in banks 78 and 79 are D-type latches.
D-type latches change their output when a rising edge
is present at a clock input and assume the value of
the signal applied to the terminal labeled D. Thus,
if a D terminal is high when a rising clock edge is
applied, the state of the flip/flop is Q = 1, that is
Q will be high and Q̄ low. If the D terminal is low,
the state of the flip/flop is Q = 0, that is Q will be
low and Q̄ high.
Each flip/flop has terminals marked PR (preset) and
CL (clear) respectively. A low level signal on the
PR terminal changes the flip/flop to the state Q = 1
and a low level signal on the terminal CL changes the
flip/flop to the state Q = 0. These two inputs
override any input signal on the D terminal and are
independent of, or asynchronous to, the clock signal.
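
The behavior of a D-type flip/flop with active-low preset and clear inputs, as just described, can be modeled in a few lines of Python. The class name `DFlipFlop` and its `update` method are hypothetical and serve only as an illustration. The case in which PR and CL are both low at once is not addressed in the text; the sketch arbitrarily lets preset win.

```python
class DFlipFlop:
    """Rising-edge-triggered D flip/flop with active-low PR and CL inputs."""

    def __init__(self, q=0):
        self.q = q
        self._last_clock = 0

    @property
    def q_bar(self):
        """Complement output."""
        return 1 - self.q

    def update(self, d, clock, preset=1, clear=1):
        """Apply input levels and return the resulting Q."""
        rising_edge = self._last_clock == 0 and clock == 1
        self._last_clock = clock
        if preset == 0:
            self.q = 1      # asynchronous preset overrides D and the clock
        elif clear == 0:
            self.q = 0      # asynchronous clear overrides D and the clock
        elif rising_edge:
            self.q = d      # Q assumes the value at D on a rising edge
        return self.q


ff = DFlipFlop()
trace = [
    ff.update(d=1, clock=0),            # no edge yet: Q stays 0
    ff.update(d=1, clock=1),            # rising edge with D high: Q = 1
    ff.update(d=0, clock=1),            # clock held high, no edge: Q stays 1
    ff.update(d=0, clock=0),            # falling edge: Q stays 1
    ff.update(d=0, clock=1),            # rising edge with D low: Q = 0
    ff.update(d=0, clock=1, preset=0),  # preset low: Q = 1 regardless of clock
    ff.update(d=1, clock=0, clear=0),   # clear low: Q = 0 regardless of D
]
```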

Six NOR gates are included in circuit 30. A
NOR gate will produce a high level output (logic "1")
only if all inputs are low. If even only one input
is high, it will produce a low level output (logic
"0"). Now as shown in Fig. 4, there are six
processors (n = 6). There are six separate grant
request lines, one from each processor 14. Each
grant request line is connected to a D terminal of a
flip/flop in bank 78. The circuit 30 has six
separate grant lines, one for each processor 14.
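
The NOR function stated above, and its use to gate a clock, can be written out directly. The helper names `nor` and `gated_clock` are hypothetical. The second function assumes an arrangement in which a NOR gate observes the Q outputs of a bank of six flip/flops and an AND gate passes the clock only while that NOR output is high.

```python
def nor(*inputs):
    """Return 1 only if every input is 0, otherwise 0."""
    return 0 if any(inputs) else 1


def gated_clock(clock, q_outputs):
    """Pass the clock level only while no Q output is high (assumed wiring)."""
    return clock & nor(*q_outputs)


all_low = nor(0, 0, 0, 0, 0, 0)                       # every input low
one_high = nor(0, 1, 0, 0, 0, 0)                      # a single input high
clock_passes = gated_clock(1, [0, 0, 0, 0, 0, 0])     # no stored request
clock_blocked = gated_clock(1, [1, 0, 0, 0, 0, 0])    # one stored request
```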

Two examples will be used to describe the
operation of circuit 30. First let us assume that a
high level grant request signal is applied only on
lead Grant Request 1. It is applied to the D
terminal of a flip/flop 80a in bank 78. It should be
noted that bank 78 includes six D-type flip/flops
80(a-f). At this time, further assume all the other
flip/flops 80(b-f) receive low-level grant request
signals. The initial state of each flip/flop in the
bank 78 is Q = 0. A NOR gate 82 receives as separate
inputs the Q output of each flip/flop in bank 78.
NOR gate 82 provides a high-level signal to an AND
gate 84. Clock signal Φ from a stable clock
circuit (not shown) passes through the AND gate 84
and is delivered to the clock input terminal of each
flip/flop in the bank 78. In response to the rising
edge of clock signal Φ, only the flip/flop 80a
changes state. Its changed state is Q = 1. A
high-level input is thereby provided to NOR gates
86(b-f) and 82. NOR gate 82 switches and provides a
low-level output to AND gate 84 which inhibits
further clock signals from passing into the clock
inputs of the flip/flops 80(a-f). As flip/flop 80a
changes state, it provides a low-level input to the D
terminal of a flip/flop 90a in bank 79. As shown,
bank 79 includes six D-type flip/flops 90(a-f), one
for each flip/flop in bank 78. Initially, the output
of each of the flip/flops in the bank 79 is high. On
the next rising edge of the clock pulse, flip/flop
90a changes state to Q = 0 and the output on its lead
labeled GRANT 1 goes from a high level to a low
level. Thus a low level signal enables the output
buffer 22 of the requesting processor (see Fig. 3).
This processor is now selected and delivers data to
BUS I as described above. A feedback signal is also
directly applied to the CL input of flip/flop 80a by
the Q output of flip/flop 90a. Flip/flop 80a
immediately changes state and in response to this
change, NOR gate 82 produces a positive high-level
signal to AND gate 84 permitting clock signals to be
provided to the clock input terminal of each
flip/flop in bank 78. The state of flip/flop 80a
will cause flip/flop 90a to change its state, back to
its initial state, at the next clock edge.

The second example now will be provided.
Assume that there are high level signals on the leads
labeled Grant Request 2 and 6 respectively. The
rising edge of clock signal Φ causes flip/flops 80b
and 80f to change state. The output of NOR gate 82
goes low and AND gate 84 is disabled. At this time
NOR gate 86f does not change state since it receives
a high-level input from flip/flop 80b. Request 2
will be honored before request 6. NOR gate 86b
changes state and provides a high-level input to the
D terminal of flip/flop 90b. On the next rising edge
of the clock signal, the flip/flop 90b changes state
and the grant signal on Grant 2 lead goes from a high
to a low level. This falling edge causes the
transfer of data from the buffer 22 of the selected
processor onto BUS I. It also provides a feedback
signal to the CL terminal of flip/flop 80b which
changes state, causing NOR gate 86b to produce a
low-level output. The state of flip/flop 80b will
cause flip/flop 90b to change its state, back to its
initial state, at the next clock edge, latching the
data on BUS I into the selected memory unit, if the
memory unit is not busy. It should be noted that NOR
gate 82 still produces a low-level output. However,
the change of state of flip/flop 80b causes NOR gate
86f to provide a high-level signal to flip/flop 90f.
On the next rising edge of the clock, flip/flop 90b
changes state as discussed above and flip/flop 90f
changes state. The grant signal on the Grant 6 lead
changes from a high to a low level. Thus, processor
6 is now connected to BUS I. A feedback signal from
flip/flop 90f clears flip/flop 80f which through NOR
gate 86f sets up flip/flop 90f to change state at the
next clock edge. NOR gate 82 now enables AND gate 84
and the next set of request signals are ready to be
latched into bank 78 on the next rising edge of
signal Φ.

Arbitor circuit 32 is identical in
construction to circuit 30 and so this circuit need
not be shown in detail. Both of these arbitor
circuits provide close to equal priority to
processors and memory units in gaining access to
their data buses, when viewed over several cycles of
access.
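
The arbitration behavior traced in the two examples can be summarized by a behavioral model in Python. It is a simplified sketch and not a gate-level rendering of Fig. 4: it collapses the clock-by-clock timing into one grant per step, and the class name `SnapshotArbiter` and its `step` method are hypothetical. It captures three properties from the text: requests are latched together, no further requests are latched until every stored request has been serviced, and stored requests are serviced in a fixed order.

```python
class SnapshotArbiter:
    """Latch a set of requests, then grant them one at a time in index order."""

    def __init__(self, n):
        self.n = n
        self.stored = [False] * n  # plays the role of the request latch bank

    def step(self, requests):
        """Advance one arbitration step; return the granted index or None."""
        if len(requests) != self.n:
            raise ValueError("one request level is expected per device")
        if not any(self.stored):
            # Nothing is stored, so the latch is enabled and a new
            # snapshot of the request lines is taken.
            self.stored = [bool(r) for r in requests]
        for index, pending in enumerate(self.stored):
            if pending:
                self.stored[index] = False  # feedback clears the request
                return index
        return None


arbiter = SnapshotArbiter(6)
grants = [
    # Devices 2 and 6 (indices 1 and 5) request together: device 2 first.
    arbiter.step([False, True, False, False, False, True]),
    # Device 1 (index 0) now requests, but device 6 is still stored.
    arbiter.step([True, False, False, False, False, False]),
    # All stored requests are serviced, so device 1 is latched and granted.
    arbiter.step([True, False, False, False, False, False]),
]
```

In this model a device with a low index cannot overtake requests that were already latched, which is one way of reading the statement that priority is close to equal when viewed over several cycles of access.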

Industrial Applicability and Advantages

Multi-processor apparatus can be used for
efficiently forming a digital image from a
photographic negative. Such a digital image can be
used in an output laser printer which makes prints.

An advantage of this invention is the
provision of an efficient arbitor circuit which
controls access to a bus and provides substantially
equal priority in gaining access to a bus to all
requesting units.
Claims:

1. Multi-processor apparatus for
processing digital signals, characterized by:

a. an array of separately addressable
memory units, each memory unit including input and
output data storage means;

b. an array of separately addressable
processors, each including input and output data
storage means;

c. first data transfer means including a
first data bus, and means for transferring data from
a selected processor output data storage means to the
input data storage means of a selected memory unit
via such first data bus; and

d. second data transfer means including a
second data bus, and means independent of said first
data transfer means for controlling the transfer of
data from a selected output data storage means of a
memory unit to a selected input data storage means of
a processor via such second data bus.

2. Multi-processor apparatus as set forth
in claim 1 comprising

a. data transfer controller means including
first arbitor means for controlling the transfer of
data on said first bus in accordance with a
predetermined sequence and second arbitor means
independent of said first arbitor means for
controlling the transfer of data on said second bus
in accordance with a predetermined sequence.

3. Multi-processor apparatus as set forth
in claim 2 wherein:

a. each memory of said array of separately
addressable memory units provides a request signal
when it is ready to transfer data;

b. each processor of said array of
separately addressable processors provides a request
signal when it is ready to transfer data;

c. said first and second data bus are
unidirectional; and

d. said first arbitor circuit is
responsive to processor request signals for
controlling the transfer of data on said first bus
and said second arbitor circuit is responsive to
memory unit request signals for controlling the
transfer of data on said second bus.

4. The invention as set forth in claim 3,
wherein each said arbitor circuit includes:

(i) latch means responsive to request
signals for storing such signals;

(ii) means for inhibiting said latch means
after such request signals have been stored from
further storing of request signals until all such
stored request signals have been serviced; and

(iii) means responsive to said stored
request signals for sequentially providing access to
the bus for the transfer of data.

5. An arbitor circuit which provides
substantially equal priorities for access to a bus
for devices which provide request signals,
characterized by:

(a) latch means responsive to request
signals from devices for storing such signals;

(b) means for inhibiting said latch means,
after such request signals have been stored, from
further storing request signals until all such stored
request signals have been serviced; and

(c) means responsive to said stored request
signals for sequentially providing access to the bus
for the requesting devices in a predetermined
sequence.

6. Multi-processor apparatus for
processing digital image data, comprising:

a. an array of separately addressable
memory units for storing a digital image, each
including input and output data storage means, each
memory unit providing a request signal when it is
ready to transfer digital image data;

b. an array of individually addressable
processors, each including input and output data
storage means, each processor providing a request
signal when it is ready to transfer digital image
data;

c. a first unidirectional data bus for
transferring digital image data from a selected
output data storage means of a processor to a
selected input data storage means of a memory unit;

d. a second unidirectional data bus for
transferring digital image data from a selected
output data storage means of a memory unit to a
selected input data storage means of a processor; and

e. data transfer controller means
including a first arbitor circuit responsive to
processor request signals for controlling the
transfer of digital image data on said first bus and
a second arbitor circuit independent of said first
arbitor circuit and responsive to memory unit request
signals for controlling the transfer of digital image
data on said second bus.